[RLlib] Fix IMPALA/APPO learning behavior: Fix EnvRunner sync bug, GPU loader thread, enable local learner w/ GPU. #48314

sven1977 · 2024-10-29T14:05:27Z

Enhance IMPALA/APPO learning behavior:

Fix EnvRunner sync bug: When running with many EnvRunners, only those few that, by chance, return samples in the same call to training_step in which IMPALA/APPO performs the Learner update get their weights synced. All others stay behind, even though they also contributed to the train batch, which leads to very large off-policyness.
GPU loader thread: Disable loading to the GPU on the NumpyToTensor connector. This load process should be handled entirely by the n GPU loader threads.
Enable local learner w/ GPU: When using a local Learner (num_learners=0), it should already use the GPU, if available.
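
To make the EnvRunner sync bug described above concrete, here is a minimal, self-contained sketch. It is a toy model, not the RLlib code changed in this PR: `WeightSyncTracker`, `run_training_steps`, and the `accumulate` flag are hypothetical names, and accumulating contributor indices across calls is only one plausible way to avoid the problem. Each tuple stands for one call to training_step: which EnvRunners returned samples, and whether a Learner update happened in that call.

```python
from typing import Iterable, List, Set, Tuple


class WeightSyncTracker:
    """Toy bookkeeping of which EnvRunners still need new weights (hypothetical)."""

    def __init__(self) -> None:
        # Indices of EnvRunners that returned samples since the last weight sync.
        self._pending: Set[int] = set()

    def record_samples(self, env_runner_indices: Iterable[int]) -> None:
        self._pending.update(env_runner_indices)

    def pop_runners_to_sync(self) -> List[int]:
        to_sync = sorted(self._pending)
        self._pending.clear()
        return to_sync


def run_training_steps(
    steps: List[Tuple[List[int], bool]], accumulate: bool
) -> List[List[int]]:
    """Return, per simulated training_step call, the EnvRunners that get synced."""
    tracker = WeightSyncTracker()
    synced_per_step: List[List[int]] = []
    for returned_indices, learner_updated in steps:
        if not accumulate:
            # Buggy variant: forget contributors from earlier calls.
            tracker.pop_runners_to_sync()
        tracker.record_samples(returned_indices)
        # Weights are only broadcast in calls where the Learner actually updated.
        synced_per_step.append(
            tracker.pop_runners_to_sync() if learner_updated else []
        )
    return synced_per_step


# Runners 0-2 contribute samples early; the Learner update happens in the third call.
steps = [([0, 1], False), ([2], False), ([3], True)]
buggy = run_training_steps(steps, accumulate=False)
fixed = run_training_steps(steps, accumulate=True)
print(buggy)
print(fixed)
```

In the buggy variant only EnvRunner 3 receives new weights, while EnvRunners 0 to 2 keep sampling with stale weights. In the accumulating variant, every EnvRunner that contributed to the train batch is synced after the update.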

Why are these changes needed?

Related issue number

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(


aslonnie

seems that it has merge conflicts now?


…U loader thread, enable local learner w/ GPU. (ray-project#48314) Signed-off-by: mohitjain2504 <[email protected]>

sven1977 and others added 6 commits October 28, 2024 18:15

wip

54579d5

Signed-off-by: sven1977 <[email protected]>

Revert "Revert "[RLlib] Upgrade to gymnasium 1.0.0 (ale_py 0.10.1, mu…

8443dcb

…joco 3.2…" This reverts commit d782b84.

wip

ab2b22c

Signed-off-by: sven1977 <[email protected]>

Merge branch 'master' of https://github.com/ray-project/ray into fix_…

43bd52f

…impala_gpu_loader_thread_and_local_learner

Merge branch 'revert-48297-revert-45328-upgrade_gymnasium_to_1_0_0a1'…

716c241

… into fix_impala_gpu_loader_thread_and_local_learner

wip

a967fd4

Signed-off-by: sven1977 <[email protected]>

sven1977 requested review from a team, maxpumperla, simonsays1980, richardliaw, edoakes and aslonnie as code owners October 29, 2024 14:05

sven1977 assigned simonsays1980 Oct 29, 2024

sven1977 added the rllib, rllib-algorithms, rllib-gpu-multi-gpu, and rllib-newstack labels Oct 29, 2024

aslonnie reviewed Oct 29, 2024


sven1977 added 11 commits October 29, 2024 18:43

wip

bc17c93

Signed-off-by: sven1977 <[email protected]>

wip

17c6bad

Signed-off-by: sven1977 <[email protected]>

wip

ee208a0

Signed-off-by: sven1977 <[email protected]>

wip

bef9e1f

Signed-off-by: sven1977 <[email protected]>

wip

43b9ba6

Signed-off-by: sven1977 <[email protected]>

Merge branch 'master' of https://github.com/ray-project/ray into fix_…

317875a

…impala_gpu_loader_thread_and_local_learner

wip

b2aebd1

Signed-off-by: sven1977 <[email protected]>

wip

c403ffe

Signed-off-by: sven1977 <[email protected]>

Merge branch 'master' of https://github.com/ray-project/ray into fix_…

bde9583

…impala_gpu_loader_thread_and_local_learner

wip

e576ebe

Signed-off-by: sven1977 <[email protected]>

wip

7396518

Signed-off-by: sven1977 <[email protected]>

sven1977 added 2 commits November 5, 2024 20:57

Merge branch 'master' of https://github.com/ray-project/ray into fix_…

f3c0352

…impala_gpu_loader_thread_and_local_learner

wip

051c3bc

Signed-off-by: sven1977 <[email protected]>

sven1977 enabled auto-merge (squash) November 5, 2024 20:19

fix

cebbec1

Signed-off-by: sven1977 <[email protected]>

github-actions bot disabled auto-merge November 5, 2024 22:50

sven1977 added 2 commits November 6, 2024 06:46

fix

fa07017

Signed-off-by: sven1977 <[email protected]>

merge

0e34fd9

Signed-off-by: sven1977 <[email protected]>

sven1977 requested review from scottjlee, bveeramani, raulchen, stephanie-wang, omatthew98, alexeykudinkin and srinathk10 as code owners November 6, 2024 05:55

sven1977 added 6 commits November 6, 2024 07:00

fix

07faf22

Signed-off-by: sven1977 <[email protected]>

Merge branch 'master' of https://github.com/ray-project/ray into fix_…

a1f68b1

…accumulation_of_results_in_algorithm Signed-off-by: sven1977 <[email protected]> # Conflicts: # rllib/core/learner/tests/test_learner_group.py

fix

fa63e33

Signed-off-by: sven1977 <[email protected]>

wip

277e057

Signed-off-by: sven1977 <[email protected]>

fix

8fae002

Signed-off-by: sven1977 <[email protected]>

fix

308e161

Signed-off-by: sven1977 <[email protected]>

sven1977 enabled auto-merge (squash) November 7, 2024 07:28

wip

3f31afa

Signed-off-by: sven1977 <[email protected]>

github-actions bot disabled auto-merge November 7, 2024 10:52

sven1977 enabled auto-merge (squash) November 7, 2024 12:09

sven1977 merged commit 359f40a into ray-project:master Nov 7, 2024
6 checks passed

sven1977 deleted the fix_impala_gpu_loader_thread_and_local_learner branch November 7, 2024 13:04

sven1977 restored the fix_impala_gpu_loader_thread_and_local_learner branch November 8, 2024 08:24

sven1977 deleted the fix_impala_gpu_loader_thread_and_local_learner branch November 11, 2024 18:55

JP-sDEV pushed a commit to JP-sDEV/ray that referenced this pull request Nov 14, 2024

[RLlib] Fix IMPALA/APPO learning behavior: Fix EnvRunner sync bug, GP…

a2278e4

…U loader thread, enable local learner w/ GPU. (ray-project#48314) Signed-off-by: JP-sDEV <[email protected]>
